
    Doctor of Philosophy

    Dissertation. The bandwidth requirement for each link on a network-on-chip (NoC) may differ based on the topology and traffic properties of the IP cores. Available bandwidth on an asynchronous NoC link will also vary depending on the wire length between sender and receiver. This work explores the benefit to NoC performance, area, and energy when this property is used to optimize bandwidth on specific links based on the bandwidth each requires in a target SoC design. Three asynchronous routers were designed for implementing asynchronous NoCs. A simple routing scheme and a single-flit packet format lead to performance- and area-efficient router designs. Their performance was evaluated with link wire delay taken into account. A comprehensive analysis of pipeline latch insertion in asynchronous communication links is performed with regard to link bandwidth. The optimal placement of pipeline latches for maximizing the bandwidth gain is described. Specific methods are proposed for performance, area, and energy optimization, respectively. Performance optimization is achieved by increasing the bandwidth of heavily trafficked, highly utilized links in an NoC by inserting pipeline latches in those links. Wire area is reduced by decreasing the bandwidth of links with low traffic and low utilization, halving their data-path width. Energy optimization is performed by using wide spacing between wires in links with high energy consumption. An analytical model for asynchronous link bandwidth estimation is presented. It is used to deploy the NoC optimization methods by identifying suitable links for each method. Energy and latency characteristics of an asynchronous NoC are compared to a similarly designed synchronous NoC. The results indicate that the asynchronous network has lower energy, and that link-specific bandwidth optimization improves NoC performance. Evaluation of the proposed optimization methods on an asynchronous NoC shows performance enhancement, wire area reduction, and wire energy savings.

    Bandwidth optimization in asynchronous NoCs by customizing link wire length

    Journal Article. The bandwidth requirement for each link on a network-on-chip (NoC) may differ based on the topology and traffic properties of the IP cores. Available bandwidth on an asynchronous NoC link will also vary depending on the wire length between sender and receiver. We explore the benefit to NoC performance when this property is used to increase bandwidth on the specific links that carry the most traffic in an SoC design. Two methods are used to accomplish this: specifying router locations on the floorplan, and adding pipeline latches on long links. Energy and latency characteristics of an asynchronous NoC are compared to a similarly designed synchronous NoC. The results indicate that the asynchronous network has lower energy, and that link-specific bandwidth optimization improves the average packet latency. Adding pipeline latches to congested links yields the most improvement. This link-specific optimization is applicable not only to the router and network we present here, but also to any asynchronous NoC used in a heterogeneous SoC.

    The Future of Formal Methods and GALS Design

    Abstract: The System-on-Chip era has arrived, and it arrived quickly. Modular composition of components through a shared interconnect is now becoming the standard, rather than the exotic. Asynchronous interconnect fabrics and globally asynchronous locally synchronous (GALS) design have been shown to be potentially advantageous. However, the development of asynchronous on-chip communication and interfaces to clocked cores is still nascent, and the road ahead is arduous. This road of converting to asynchronous networks, and potentially the core intellectual property block as well, will be rocky. Asynchronous circuit design has been employed since the 1950s. However, it is doubtful that its present form will be what we will see 10 years hence. This treatise is intended to provoke debate as it projects what technologies will look like in the future, and discusses, among other aspects, the role of formal verification, education, the CAD industry, and the ever-present tradeoff between greed and fear.

    Comparing Energy and Latency of Asynchronous and Synchronous NoCs for Embedded SoCs

    Abstract: Power consumption of on-chip interconnects is a primary concern for many embedded system-on-chip (SoC) applications. In this paper, we compare energy and performance characteristics of asynchronous (clockless) and synchronous network-on-chip implementations, optimized for a number of SoC designs. We adapted the COSI-2.0 framework with ORION 2.0 router and wire models for synchronous network generation. Our own tool, ANetGen, specifies the asynchronous network by determining the topology with simulated annealing and router locations with force-directed placement. It uses energy and delay models from our 65nm bundled-data router design. SystemC simulations varied traffic burstiness using the self-similar b-model. Results show that the asynchronous network provided lower median and maximum message latency, especially under bursty traffic, and used far less router energy with a slight overhead for the inter-router wires.

    Performance evaluation of elastic GALS interfaces and network fabric

    This paper reports on the design of a test chip built to test a) a new latency-insensitive network fabric protocol and circuits, b) a new synchronizer design, and c) how efficiently one can synchronize into a clocked domain when elastic interfaces are utilized. Simulations show that the latency-insensitive network allows excellent characterization of network performance in terms of the cost of routing, amount of blocking due to congestion, and message buffering. The network routers show that peak performance near 100% link utilization is achieved under congestion and combining. This enables accurate high-level modeling of the behavior of the network fabric so that optimized network design, including placement and routing, can occur through high-level network synthesis tools. The chip also shows that when elastic interfaces are used at the boundary of clock synchronization points, efficient domain crossings can occur. Buffering at the synchronization points is required to allow for variability in clocking frequencies and to ensure correct data transmission. The asynchronous buffering and synchronization scheme is shown to perform over four times faster than the clocked interface.